Structure Cognizant Pseudo Relevance Feedback

نویسندگان

  • Arjun Atreya V
  • Yogesh Kakde
  • Pushpak Bhattacharyya
  • Ganesh Ramakrishnan
چکیده

We propose a structure cognizant framework for pseudo relevance feedback (PRF). This has an application, for example, in selecting expansion terms for general search from subsets such as Wikipedia, wherein documents typically have a minimally fixed set of fields, viz., Title, Body, Infobox and Categories. In existing approaches to PRF based expansion, weights of expansion terms do not depend on their field(s) of origin. This, we feel, is a weakness of current PRF approaches. We propose a per field EM formulation for finding the importance of the expansion terms, in line with traditional PRF. However, the final weight of an expansion term is found by weighting these importance based on whether the term belongs to the title, the body, the infobox or the category field(s). In our experiments with four languages, viz., English, Spanish, Finnish and Hindi, we find that this structure-aware PRF yields a 2% to 30% improvement in performance (MAP) over the vanilla PRF. We conduct ablation tests to evaluate the importance of various fields. As expected, results from these tests emphasize the importance of fields in the order of title, body, categories and infobox.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...

متن کامل

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...

متن کامل

Recurrent Pseudo Relevance Feedback on Web Collections

Various Relevance Feedback techniques exist in Information Retrieval such as Simulated Relevance Feedback and Pseudo Relevance Feedback. In a Simulated Relevance Feedback technique a new query is reformulated based on the documents selected by the user from the top-ranked documents whereas in a Pseudo Relevance Feedback, the query is reformulated based on the assumption that N top-ranked docume...

متن کامل

Rutgers Information Interaction Lab at TREC 2005: Trying HARD

Within the structure of the TREC 2005 HARD track guidelines, we investigated the following hypotheses: H1: Query expansion using a “clarity”-based approach will increase effectiveness over baseline queries and baseline queries plus pseudo-relevance feedback; H2: Query expansion based on the Web will increase effectiveness over baseline queries and baseline queries plus pseudo-relevance feedback...

متن کامل

Pseudo Relevance Feedback Method Based On Taylor Expansion Of Retrieval Function In NTCIR-3 Patent Retrieval Task

Pseudo relevance feedback is empirically known as a useful method for enhancing retrieval performance. For example, we can apply the Rocchio method, which is well-known relevance feedback method, to the results of an initial search by assuming that the top-ranked documents are relevant. In this paper, for searching the NTCIR-3 patent test collection through pseudo feedback, we employ two releva...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013